This is the report for the final project of the Advanced Machine Learning course by Professor Jeremy Bolton.

GitHub repositories for the code:
Data Gatherer (C#): https://github.com/aliyektaie/PhonemeDetectionDataGatherer
Main Model (Python): https://github.com/aliyektaie/PhonemeDetectionDeepNerualNetwork

References:
https://cmusphinx.github.io/wiki/phonemerecognition/
https://pdfs.semanticscholar.org/3866/55ee41444c75fd04ea116a05b4a20423d55a.pdf
https://medium.com/@ageitgey/machine-learning-is-fun-part-6-how-to-do-speech-recognition-with-deep-learning-28293c162f7a
http://www.cs.toronto.edu/~graves/icml_2006.pdf
https://blog.keras.io/a-ten-minute-introduction-to-sequence-to-sequence-learning-in-keras.html
https://www.dlology.com/blog/how-to-train-a-keras-model-to-recognize-variable-length-text/
https://arxiv.org/pdf/1901.07957.pdf?
http://www.practicalcryptography.com/miscellaneous/machine-learning/guide-mel-frequency-cepstral-coefficients-mfccs/
https://haythamfayek.com/2016/04/21/speech-processing-for-machine-learning.html
http://www.odyssey2016.org/papers/slides/24_Fri/02%20-%20Keynote_3/Keynote_3.pdf
http://luthuli.cs.uiuc.edu/~daf/courses/cs-498-daf-ps/lecture%208%20-%20audio%20features2.pdf
https://arxiv.org/pdf/1603.00982.pdf
https://arxiv.org/pdf/1712.02898.pdf
https://www.youtube.com/watch?v=9dXiAecyJrY
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4272376/
https://stanford.edu/~shervine/blog/keras-how-to-generate-data-on-the-fly
https://skymind.ai/wiki/open-datasets
https://www.youtube.com/watch?v=HyUtT_z-cms&t=4s
https://distill.pub/2017/ctc/